The hrp2malaRia package was written by OJ Watson, a PhD Student with the Imperial College of London Mathematical Modeling group. This package was used in his recent Elife Publication, Watson OJ et al. 2017, Modelling the drivers of the spread of Plasmodium falciparum hrp2 gene deletions in sub-Saharan Africa. The ideas and much of theory that we will be discussing today was spear-headed by UNC Physician-Scientist, Jonathan Parr.
Additional thanks to Dayne Filer, Mike Fliss, and Sara Levintow for hacks.
You are an Officer in the CDC’s elite Epidemic Intelligence Service (EIS) Malaria Branch and are called one day by a concerned ministry of health official in a Sub-Saharan Africa stating that their hospitals have seen several patients that appear to have malaria but had negative rapid diagnostic tests. They have read recent reports of hrp2 deletions and wonder if you can model this scenario and predict where they should be on the lookout for hrp2 deletions.
remotes. tidyerverse, RColorBrewer, and cowplot. We will be using remotes to install some packages from Github as well.The malaria life cycle is complex with several different stages. For the purposes of this exercise, it is only important to note that the malaria parasite is transmitted by a mosquito vector that infects the human-host.
In clinical and field settings, malaria is often diagnosed with either the use of light microscopy or rapid diagnostic tests (RDTs). In both cases, a finger-prick of blood is taken from patients and used to screen for malaria. Confirmation of infection by microscopy is made with direct visualization of the parasite by an expert microscopist while RDTs provide a “yes/no” result and can be performed by relatively anyone.
RDTs detect parasite antigens (e.g. a parasite-specific marker) from the patient’s peripheral blood. In the case of the P. falciparum, the vase majority of Pf-RDTs detect a P. falciparum specific protein, histidine-rich protein 2 (hrp2).
However, recent evidence has accumulated to suggest that P. falciparum parasites have undergone selection for hrp2-gene deletions, making them no longer detectable by many RDTs (“stealth parasites”). In theory, by avoiding detection, parasites can then remain untreated and have a higher likelihood of spreading.
Note, for the purposes of our discussions, we will ignore differences between the lower-limits of detection (and sensitivity/specificity) of these two diagnostic modalities.
In regions of high malaria prevalence, it is common for individuals to experience numerous innocuous bites every single evening. For example, in the Democratic Republic of the Congo, it is estimated that the average individual experiences 60-400 infectious mosquito bites/person/year*. As a result of this intense burden, individuals can be infected with numerous different strains of malaria (termed “superinfection” – note, for the purpose of this thought exercise we will ignore “co-transmission” events).
In this lab, we are going to explore different scenarios where hrp2 deletions are expected to increase in frequency (e.g. be “selected for”) or decrease in frequency (e.g. be “selected against”). We will also explore the consequences of these increases through an epidemiological lens.
Before we began, discuss the following points with your neighbors (or as a group):
If you were asked to a design an antigen-based detection method, what type of antigen would you pick (e.g. from an evolutionary point of view, what type of proteins/markers would be best)? _________________________________________________________
In the previous exercise led by Dr. Nunn, you first used a deterministic SIR model to analyze the spread of a “childhood disease” through a susceptible population. You then extended your deterministic SIR model to include fluctuations in the population size through births and deaths (e.g. a demographic model). Next, you modeled the uncertainty and the variation in your model through simulation using a stochastic approach. As a reminder, stochastic individual-based models essentially follow individuals through the different compartments and record their states/times as they move through the model. This explicitly models individual-level heterogeneity and allows for several possible extension/fine-level manipulations (e.g. see network stochastic models, etc.) In short, a deterministic model is a “closed-system” where we will arrive at the same answer every time (given the same parameters, initial conditions, etc). In contrast, a stochastic model let’s us simulate many possible iterations of our model and directly express the randomness (or really uncertainty) in our model parameters and outcome.
In this exercise, we are going to use an individual-based, stochastic malaria transmission model that is an extension of the “Imperial Model” (see OJ’s Material & Methods) for a full description via Griffin et al. 2014, 2015, 2016) OJ describes the Imperial Model fully.
Refer to Figure 5 from OJ’s manuscript. What kind of Compartment Model is the “hrp2-Imperial Model”? ______________________
Note, for the purpose of this exercise, we are going to ignore some of the (beautiful) complexities of the model and instead talk about this model in a Susceptible-Infected-Recovered-Susceptible framework (i.e. the U,A,D compartments are collapsed into the “Infected” category and Recovered corresponds to the period of active treatment/prophylaxis – the T,P compartments – when individuals are “immune”). What is another scenario that SIRS models can be applied to (hint: think about your tetanus vaccines)? __________________
We are going to be focusing on a few adaptations of “hrp2-Imperial Model”, namely:
For each of the parameters listed above, which SIRS compartment will be most affected? ________________
* Note, for the purpose of this exercise, we will ignore cross-reactivity of the hrp3 epitope by setting \(\epsilon\) to 0 and set RDT adherence to 100%. Put simply, this means that if an individual only contains hrp2 deleted parasites, they will not receive treatment (but if they contain any wildtype parasites, they may receive treatment depending on their probability of seeking treatment).
This is copied from the README that accompanies the hrp2malaria package instructions. Please note, this is a subset of potential input variables to change and the output variables of interest.
| Variable | Description |
|---|---|
| years | Number of years simulation is run for. 20 years will be usually long enough to see hrp2 selection at low EIRs. |
| N | Population size. The model is stochastic and so a larger N will enable clearer patterns to be found in model outputs, however, they will take longer to run. Population of 2000 should be a good balance of speed and clarity for demonstration purposes. For the analysis in the eLife modelling paper N was set to 100,000 |
| strains.0 | Starting ratio of strains in the population. This number must be an integer. The default is set to 10, which will mean that on initialization there will be roughly 10 times as many wild type strains as there are hrp2 deleted. |
| EIR | Annual EIR, provided as a fraction, i.e. the default EIR of 20/365 represents an EIR of 20. |
| ft | Frequency of treatment being sought upon developing clinical disease. Must be a number between 0 and 1, with the default being 0.4, i.e. 40% of cases seek treatment. |
| rdt.det | The probability that an individual only infected with hrp2-deleted strains still produces a positive RDT. The default is 1, i.e. no selective pressure. For the eLife paper we used a value of 0.25 that represented the approximate 25% chance that a positive RDT would still be produced due to hrp3 protein epitope cross reactivity |
| rdt.nonadherence | The probability that the result of an RDT is ignored, i.e. a negative RDT due to hrp2 deletions will still be treated. Default = 0, and in the eLife paper we used a value of 10% based on data collected by the WHO. |
| microscopy.use | The proportion of cases that are tested by microscopy rather than RDT. Default = 0, and in the eLife paper we tested the impact of 30% of cases being diagnosed by microscopy, representing the ~70% of cases diagnosed being diagnosed by RDTs in 2014. |
| fitness | Comparative fitness of hrp2 deleted parasites. Default = 1, which mean the wild type is equally as likely to be passed on to mosquitoes as an hrp2 deleted parasite. 0.8 results in the hrp2 deleted strain being 80% likely to be passed on, whereas a value of 1.2 would mean it is 20% more likely. |
| Variable | Description |
|---|---|
| S.Times | The timing of when the simulated population is sampled, i.e. the time at which all other series variables (variables that begin S.) were collected |
| S.Prev.All | The prevalence of malaria within the whole population |
| S.Prev.05 | The prevalence of malaria within under 5s |
| S.N.Dels | The proportion of all parasites that are hrp2 deleted |
| S.N.Dels.05 | The proportion of all parasites within under 5s that are hrp2 deleted |
| S.Incidence | The daily clinical incidence of malaria. Multiply this by 365 to have the expected number of clinical cases of malaria a person within the whole population is expected to have in a year |
| S.Incidence.05 | The daily clinical incidence of malaria. Multiply this by 365 to have the expected number of clinical cases of malaria a child under the age of 5 is expected to have in a year |
| S.Prev.Mono.D | The proportion of infected individuals who are only infected with hrp2 deleted parasites |
| S.Prev.Mono.D.05 | The proportion of infected individuals under the age of 5 who are only infected with hrp2 deleted parasites |
For the purpose of the exercise, we are going to make some more simplifying assumptions. As noted above, we will set the models so that if an individual only contains hrp2 deleted parasites, they will not have a positive RDT, and therefore, will not receive treatment (but if they contain any wildtype parasites, they may receive treatment depending on their probability of seeking treatment). We are also going to set our populations and time steps to 2,000 individuals and 20 years, respectively.
However, we will vary the diagnostic modalities that populations use, the transmission intensity (EIR), and the fitness of the parasites.
First you will need to make sure that you have the remotes packaged installed. This is a wonderful package that allows for R developers to install code from various repositories as a corner-stone of reproducible research. If you don’t already have this package, type install.packages("remotes") into your console (quick download). We will now download the hrp2malaria package that is the foundation of this lab.
# check to make sure user has the needed packages; if missing, install for them
list.of.packages <- c("tidyverse", "remotes", "cowplot", "RColorBrewer")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
remotes::install_github("OJWatson/hrp2malaRia")
remotes::install_github("nickbrazeau/EMSI2019hrp2lab") # hacky way to download this tutorial as a package
remotes::install_github("DavisVaughan/furrr") # we are going to use this for speed later
library(hrp2malaRia)
library(EMSI2019hrp2lab)
library(furrr)
library(tidyverse)
library(RColorBrewer)
library(cowplot)
set.seed(42) # please don't run this line!
Imagine that there are three populations with a constant EIR of 1 (reminder, EIR is the entomological inoculation rate, or the number of infectious mosquito bites per person per year). In addition, all populations start with approximately 10% frequency of hrp2 deletions and hrp2 deleted and wildtype parasites have equal fitness. However, the populations differ by:
Population 1:
Population 2:
Population 3:
As noted above, the hrp2_Simulation function returns many results that allow for many different questions to be investigated. For the purpose of our exercise, we will focus on the increase of hrp2 frequency (df$S.N.Dels) over time (df$S.Times) as well as incidence (df$S.Incidence) over time (df$S.Times)
Let’s run the code (once) and plot the result:
# Simulations take ~1 minute to run on my MacBook Pro (3.1 GHz Intel Core i7)
# Population 1
# as a side note, let's track how long this takes
start_time <- Sys.time()
senario1.pop1 <- hrp2malaRia::hrp2_Simulation(N=2000,
years=20,
strains.0 = 10,
rdt.nonadherence = 0,
EIR=1/365,
microscopy.use = 0,
rdt.det = 0,
fitness = 1
)
## [1] 2.000000000 0.938027691 0.017019987 0.005870036 1.871794872 0.000000000
## [1] 7.370000e+02 9.445417e-01 1.195698e-02 4.419006e-03 4.643766e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.373667e-01 1.690096e-02 6.650007e-03 4.614412e-01
## [6] 2.147059e+01
## [1] 2.207000e+03 9.318295e-01 2.100431e-02 8.083852e-03 4.679487e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.251919e-01 2.610919e-02 9.616674e-03 4.828042e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.239892e-01 2.737176e-02 9.556782e-03 1.442688e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.339786e-01 1.974521e-02 7.193909e-03 2.462888e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.286655e-01 2.372429e-02 8.527926e-03 4.691517e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.355400e-01 1.854760e-02 6.830082e-03 3.753213e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.340496e-01 1.967587e-02 7.192206e-03 4.697555e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 38.512 secs"
# Population 2
senario1.pop2 <- hrp2malaRia::hrp2_Simulation(N=2000,
years=20,
strains.0 = 10,
rdt.nonadherence = 0,
EIR=1/365,
microscopy.use = 0.5,
rdt.det = 0,
fitness = 1
)
## [1] 2.000000000 0.937746094 0.017301584 0.005870036 1.402048656 0.000000000
## [1] 7.370000e+02 9.438160e-01 1.253108e-02 4.570681e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.448219e-01 1.191571e-02 4.180118e-03 9.125000e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.470266e-01 1.043272e-02 3.458366e-03 1.363636e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.425597e-01 1.360800e-02 4.750007e-03 9.034653e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.486319e-01 8.828297e-03 3.457527e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.496720e-01 8.133835e-03 3.111833e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.381591e-01 1.683134e-02 5.927295e-03 9.358974e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.438357e-01 1.261418e-02 4.467828e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.426647e-01 1.347364e-02 4.779366e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 37.395 secs"
# Population 3
senario1.pop3 <- hrp2malaRia::hrp2_Simulation(N=2000,
years=20,
strains.0 = 10,
rdt.nonadherence = 0,
EIR=1/365,
microscopy.use = 1,
rdt.det = 0,
fitness = 1
)
## [1] 2.000000000 0.938076341 0.016971337 0.005870036 0.472186287
## [6] 52.142857143
## [1] 737.00000000 0.93667950 0.01817712 0.00606110 0.47096774
## [6] 0.00000000
## [1] 1.472000e+03 9.362278e-01 1.792906e-02 6.760827e-03 9.299363e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.345923e-01 1.964799e-02 6.677386e-03 2.324841e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.445505e-01 1.184586e-02 4.521359e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.406812e-01 1.489725e-02 5.339300e-03 9.113608e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.396121e-01 1.552303e-02 5.782612e-03 4.562500e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.460088e-01 1.110151e-02 3.807386e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.477792e-01 9.845482e-03 3.293024e-03 4.511743e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.484229e-01 9.201337e-03 3.293510e-03 9.034653e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 37.314 secs"
end_time <- Sys.time()
sequentialtime <- end_time - start_time
# Now, let's plot these - all veriables that begin with S. are series variables collected over time.
# let's plot the first population
plot(senario1.pop1$S.Times,senario1.pop1$S.N.Dels, xlab = "time (days)", ylab = "hrp2 deletion Frequency", ylim=c(0,1), col="red", type="l", lwd=2)
# and add the second population
lines(senario1.pop2$S.Times, senario1.pop2$S.N.Dels,lwd=2, col="blue")
# and add the third population
lines(senario1.pop3$S.Times, senario1.pop3$S.N.Dels,lwd=2, col="green")
# and add our legend
legend(2000, .25, legend=c("Only RDT use", "50% RDT use", "0% RDT Use"),
col=c("red", "blue", "green"), lty=1, cex=0.8,
box.lty=2, box.lwd=2)
Given that this is a stochastic model, there will be run-to-run variability of each model iteration. This variability is important in quantifying our uncertainty of the stochastic process that we are modeling.
To get multiple runs of the model, you can simply copy and paste the code above (and rename the output objects [e.g. r1.2], as to not overwrite your previous results). Or, we can make a map dataframe, where we specify the different parameters we want for each run and use this dataframe as the “captain” calling the “plays”.
In this map dataframe, our parameter names will be columns and our parameter options will be cells with each row being its own respective model. Using a map file is a cornerstone of functional programming and a high-yield concept for reproducible research. Plus, as Dr. Nunn noted earlier, stochastic models are typically slower to run than deterministic models. As a result, we now have a need for speed, which is made easier with a map dataframe that we can “loop” through.
Why do you think stochastic models are typically slower to run than deterministic models? _____________________
We are going to use the furrr package to take advantage of the multiple processors on your computer (e.g. parallelize the simulations). This is simply for speed. If you are having trouble on your personal computer, you can change the furrr::future_pmap commands below to purrr::pmap and will get the same result (albeit at a slower pace).
Let’s try it out below:
hrp2map <- data.frame(run = c(1,1,1, 2,2,2, 3,3,3), # note you could use rep here (this is for clarity)
N=2000,
years=20,
strains.0 = 10,
rdt.nonadherence = 0,
EIR=1/365,
microscopy.use = c(0,0.5,1, 0,0.5,1, 0,0.5,1),
rdt.det = 0,
fitness = 1)
Let’s look at what we just made:
dplyr::glimpse(hrp2map)
## Observations: 9
## Variables: 9
## $ run <dbl> 1, 1, 1, 2, 2, 2, 3, 3, 3
## $ N <dbl> 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000,…
## $ years <dbl> 20, 20, 20, 20, 20, 20, 20, 20, 20
## $ strains.0 <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10
## $ rdt.nonadherence <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ EIR <dbl> 0.002739726, 0.002739726, 0.002739726, 0.002739…
## $ microscopy.use <dbl> 0.0, 0.5, 1.0, 0.0, 0.5, 1.0, 0.0, 0.5, 1.0
## $ rdt.det <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ fitness <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1
OK, let’s run our models and store them in our map dataframe.
# note, we have to tell R that scenario and populations aren't parameters that hrp2_Simulation will accept
future::plan(multiprocess)
start_time <- Sys.time()
hrp2map$results <- furrr::future_pmap(hrp2map[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")],
hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.937789037 0.017258642 0.005870036 0.481530343 0.000000000
## [1] 7.370000e+02 9.285488e-01 2.404352e-02 8.325436e-03 4.815303e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.360696e-01 1.839260e-02 6.455492e-03 4.703608e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.349764e-01 1.908321e-02 6.858071e-03 9.772423e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.296419e-01 2.254842e-02 8.727421e-03 1.473755e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.302527e-01 2.270138e-02 7.963609e-03 1.465863e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.260954e-01 2.578641e-02 9.035866e-03 9.681698e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.287237e-01 2.350859e-02 8.685455e-03 9.505208e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.267969e-01 2.503800e-02 9.082796e-03 2.327806e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.229371e-01 2.795968e-02 1.002092e-02 2.315990e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 73.558 secs"
## [1] 2.000000000 0.938093674 0.016954004 0.005870036 0.486018642 0.000000000
## [1] 7.370000e+02 9.416223e-01 1.431508e-02 4.980332e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.387285e-01 1.617671e-02 6.012498e-03 1.975643e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.439633e-01 1.266424e-02 4.290140e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.421197e-01 1.400115e-02 4.796855e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.428770e-01 1.324656e-02 4.794123e-03 4.758801e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.465764e-01 1.039426e-02 3.947086e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.479050e-01 9.387148e-03 3.625567e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.477657e-01 9.504744e-03 3.647261e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.523790e-01 6.297164e-03 2.241597e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 71.236 secs"
## [1] 2.000000000 0.937993009 0.017054669 0.005870036 0.922882427 0.000000000
## [1] 737.00000000 0.94649912 0.01050683 0.00391176 0.00000000
## [6] 0.00000000
## [1] 1.472000e+03 9.414193e-01 1.424447e-02 5.253974e-03 1.389594e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.402833e-01 1.498247e-02 5.651974e-03 9.090909e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.405978e-01 1.478139e-02 5.538530e-03 1.384324e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.411119e-01 1.434713e-02 5.458687e-03 1.382576e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.362304e-01 1.841741e-02 6.269895e-03 9.136421e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.420556e-01 1.379100e-02 5.071160e-03 9.136421e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.372267e-01 1.781254e-02 5.878466e-03 9.147870e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.339309e-01 1.991022e-02 7.076595e-03 9.182390e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 72.807 secs"
## [1] 2.000000000 0.938113298 0.016934380 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.446954e-01 1.191023e-02 4.312038e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.436102e-01 1.276107e-02 4.546453e-03 4.840849e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.436312e-01 1.253966e-02 4.746825e-03 9.530026e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.381188e-01 1.659446e-02 6.204419e-03 9.605263e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.324246e-01 2.068535e-02 7.807735e-03 4.853723e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.389675e-01 1.589635e-02 6.053833e-03 5.041436e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.380280e-01 1.717206e-02 5.717645e-03 9.746328e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.320837e-01 2.134408e-02 7.489918e-03 9.517601e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.347836e-01 1.896892e-02 7.165152e-03 9.383033e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 72.955 secs"
## [1] 2.000000000 0.937916177 0.017131501 0.005870036 1.843434343 0.000000000
## [1] 7.370000e+02 9.394610e-01 1.579236e-02 5.664325e-03 9.335038e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.398447e-01 1.535750e-02 5.715503e-03 4.727979e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.387940e-01 1.626389e-02 5.859834e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.418029e-01 1.406254e-02 5.052293e-03 9.592641e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.408651e-01 1.498898e-02 5.063675e-03 2.410832e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.420547e-01 1.387456e-02 4.988497e-03 1.463904e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.498220e-01 8.098954e-03 2.996793e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.491058e-01 8.868538e-03 2.943401e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.494329e-01 8.432191e-03 3.052619e-03 4.655612e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 72.018 secs"
## [1] 2.000000000 0.937806548 0.017241131 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.266840e-01 2.507510e-02 9.158619e-03 2.298489e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.391882e-01 1.568970e-02 6.039765e-03 9.205549e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.415743e-01 1.425392e-02 5.089454e-03 9.443726e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.438571e-01 1.247399e-02 4.586656e-03 9.079602e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.466662e-01 1.043451e-02 3.817034e-03 9.275731e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.444941e-01 1.226461e-02 4.158978e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.501976e-01 7.707313e-03 3.012767e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.509242e-01 7.250067e-03 2.743465e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.489112e-01 8.633443e-03 3.373096e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 36.348 secs"
## [1] 2.000000000 0.937802078 0.017245600 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.326059e-01 2.066730e-02 7.644533e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.295670e-01 2.301951e-02 8.331239e-03 1.862245e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.388112e-01 1.655598e-02 5.550485e-03 9.299363e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.329360e-01 2.053707e-02 7.444623e-03 1.372180e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.356599e-01 1.872161e-02 6.536165e-03 4.562500e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.393568e-01 1.579163e-02 5.769308e-03 9.263959e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.380146e-01 1.693450e-02 5.968573e-03 4.661558e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.307728e-01 2.218504e-02 7.959909e-03 2.281250e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.261455e-01 2.553316e-02 9.239033e-03 8.968059e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 72.803 secs"
## [1] 2.000000000 0.938109335 0.016938344 0.005870036 1.477732794 0.000000000
## [1] 7.370000e+02 9.462491e-01 1.058312e-02 4.085539e-03 9.811828e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.475303e-01 9.878088e-03 3.509295e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.462744e-01 1.083537e-02 3.807893e-03 1.431373e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.488855e-01 8.807959e-03 3.224259e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.514765e-01 6.871997e-03 2.569170e-03 4.721863e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.516637e-01 6.877652e-03 2.376373e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.537310e-01 5.195849e-03 1.990863e-03 4.579674e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.542535e-01 4.839139e-03 1.825064e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.552773e-01 4.172819e-03 1.467588e-03 4.631980e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 70.323 secs"
## [1] 2.000000000 0.938139311 0.016908367 0.005870036 1.363636364 0.000000000
## [1] 7.370000e+02 9.436287e-01 1.274357e-02 4.545454e-03 9.431525e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.473742e-01 9.986031e-03 3.557464e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.486448e-01 9.078603e-03 3.194313e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.510047e-01 7.225494e-03 2.687548e-03 4.691517e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.476797e-01 9.722557e-03 3.515493e-03 4.815303e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.429700e-01 1.305331e-02 4.894417e-03 4.777487e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.385152e-01 1.629715e-02 6.105325e-03 1.450331e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.378777e-01 1.687732e-02 6.162723e-03 4.802632e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.393386e-01 1.571528e-02 5.863851e-03 1.458056e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 71.362 secs"
end_time <- Sys.time()
paralleltime <- end_time - start_time
Just to prove a point, check out how much faster your code ran in parallel versus sequentially:
sequentialtime
## Time difference of 1.888728 mins
paralleltime
## Time difference of 2.062363 mins
Now let’s visualize our plots of the proportion of hrp2 deleted parasites over time.
# note, I wrote this function for you, it returns a ggplot
plot.scenario1.run1 <- plothrp2models(pop1 = hrp2map$results[[1]],
pop2 = hrp2map$results[[2]],
pop3 = hrp2map$results[[3]],
labels = c("Only RDT use", "50% RDT use", "0% RDT use")) # you need to manually add the labels
# you can manually add a title like this too
plot.scenario1.run1 <- plot.scenario1.run1 + ggtitle("hrp2 Simulation Scenario 1, Run 1")
# now consider run 2
plot.scenario1.run2 <- plothrp2models(pop1 = hrp2map$results[[4]],
pop2 = hrp2map$results[[5]],
pop3 = hrp2map$results[[6]],
labels = c("Only RDT use", "50% RDT use", "0% RDT use"))
plot.scenario1.run2 <- plot.scenario1.run2 + ggtitle("hrp2 Simulation Scenario 1, Run 2")
# now consider run 3
plot.scenario1.run3 <- plothrp2models(pop1 = hrp2map$results[[7]],
pop2 = hrp2map$results[[8]],
pop3 = hrp2map$results[[9]],
labels = c("Only RDT use", "50% RDT use", "0% RDT use"))
plot.scenario1.run3 <- plot.scenario1.run3 + ggtitle("hrp2 Simulation Scenario 1, Run 3")
# Plot them together
cowplot::plot_grid(plot.scenario1.run1,
plot.scenario1.run2,
plot.scenario1.run3,
ncol=1)
How much variation did you appreciate between runs? __________________________________
Was there anything unusual about your plots? Why was it unusual? _______________________________
Try running the models for a sample size of 100,000 (like the eLife paper). Note, this will take some time but the larger N should elicit a clearer/more consistent pattern.
Again, imagine that there are three populations with a constant EIR of 1 (reminder, EIR is the entomological inoculation rate, or the number of infectious mosquito bites per person per year). In addition, all populations start with approximately 10% frequency of hrp2 deletions and all populations only use RDTs for diagnostics. However, the populations differ by:
Population 1:
Population 2:
Population 3:
# going to overwrite our old hrp2map and make a new one
hrp2map <- data.frame(
run = c(1,1,1,2,2,2),
N=2000,
years=20,
strains.0 = 10,
rdt.nonadherence = 0,
EIR=1/365,
microscopy.use = 0,
rdt.det = 0,
fitness = c(0.5, 1, 1.5, 0.5, 1, 1.5)
)
hrp2map$results <- furrr::future_pmap(hrp2map[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")],
hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.937880420 0.017167259 0.005870036 0.475880052 0.000000000
## [1] 7.370000e+02 9.358669e-01 1.852343e-02 6.527393e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.355546e-01 1.847213e-02 6.890933e-03 9.383033e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.395899e-01 1.582276e-02 5.505091e-03 1.393130e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.448582e-01 1.190821e-02 4.151340e-03 1.379093e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.455246e-01 1.093836e-02 4.454800e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.448198e-01 1.210325e-02 3.994618e-03 4.626109e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.454009e-01 1.133661e-02 4.180228e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.407107e-01 1.491021e-02 5.296794e-03 4.451220e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.402042e-01 1.539070e-02 5.322835e-03 9.090909e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 58.379 secs"
## [1] 2.000000000 0.937675888 0.017371790 0.005870036 0.463786531 0.000000000
## [1] 7.370000e+02 9.305199e-01 2.247127e-02 7.926524e-03 4.473039e-01
## [6] 2.807692e+01
## [1] 1.472000e+03 9.323406e-01 2.101782e-02 7.559327e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.308617e-01 2.205894e-02 7.997089e-03 1.367041e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.297003e-01 2.281160e-02 8.405776e-03 4.596977e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.252118e-01 2.633644e-02 9.369431e-03 2.324841e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.252974e-01 2.602692e-02 9.593349e-03 9.468223e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.280622e-01 2.389288e-02 8.962687e-03 1.425781e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.312428e-01 2.182487e-02 7.850065e-03 9.554974e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.289003e-01 2.366435e-02 8.353103e-03 1.901042e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 58.982 secs"
## [1] 2.000000000 0.937896625 0.017151054 0.005870036 1.423927178 0.000000000
## [1] 7.370000e+02 9.460667e-01 1.075123e-02 4.099826e-03 2.385621e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.344377e-01 1.966164e-02 6.818419e-03 2.404480e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.358108e-01 1.839600e-02 6.710885e-03 4.945799e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.345697e-01 1.933488e-02 7.013175e-03 1.422078e+00
## [6] 2.281250e+01
## [1] 3.677000e+03 9.387914e-01 1.618095e-02 5.945351e-03 4.771242e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.381058e-01 1.677199e-02 6.039953e-03 9.182390e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.345114e-01 1.923666e-02 7.169698e-03 1.375628e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.300891e-01 2.260855e-02 8.220078e-03 4.697555e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.290805e-01 2.359091e-02 8.246274e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 58.321 secs"
## [1] 2.000000000 0.938131085 0.016916593 0.005870036 0.905707196 0.000000000
## [1] 7.370000e+02 9.441675e-01 1.214409e-02 4.606153e-03 4.579674e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.467722e-01 1.019846e-02 3.947062e-03 9.090909e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.459101e-01 1.089988e-02 4.107750e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.528423e-01 5.912049e-03 2.163331e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.540192e-01 5.074824e-03 1.823720e-03 4.445798e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.513227e-01 7.044500e-03 2.550489e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.499063e-01 7.975236e-03 3.036197e-03 8.869988e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.534952e-01 5.479154e-03 1.943340e-03 4.434994e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.517694e-01 6.674433e-03 2.473871e-03 9.023486e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 56.522 secs"
## [1] 2.000000000 0.937770584 0.017277094 0.005870036 1.852791878
## [6] 33.181818182
## [1] 7.370000e+02 9.423227e-01 1.377049e-02 4.824532e-03 4.740260e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.425903e-01 1.340750e-02 4.919912e-03 4.631980e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.353645e-01 1.898311e-02 6.570116e-03 2.278402e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.377276e-01 1.685320e-02 6.336939e-03 1.857506e+00
## [6] 3.318182e+01
## [1] 3.677000e+03 9.369633e-01 1.744207e-02 6.512304e-03 1.402049e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.313053e-01 2.157479e-02 8.037659e-03 9.383033e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.348300e-01 1.897195e-02 7.115783e-03 1.896104e+00
## [6] 2.027778e+01
## [1] 5.882000e+03 9.356115e-01 1.829820e-02 7.008022e-03 4.796321e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.342203e-01 1.954126e-02 7.156189e-03 1.869398e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 58.588 secs"
## [1] 2.000000000 0.938118728 0.016928951 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.367177e-01 1.792428e-02 6.275747e-03 1.423927e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.370264e-01 1.724262e-02 6.648713e-03 1.425781e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.346493e-01 1.931133e-02 6.957062e-03 1.420233e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.372285e-01 1.719770e-02 6.491561e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.363968e-01 1.791647e-02 6.604464e-03 9.580052e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.378270e-01 1.695242e-02 6.138308e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.396067e-01 1.554566e-02 5.765330e-03 1.427640e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.395068e-01 1.557861e-02 5.832318e-03 1.420233e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.339625e-01 1.966110e-02 7.294067e-03 4.752604e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 58.686 secs"
plot.scenario2.run1 <- plothrp2models(pop1 = hrp2map$results[[1]],
pop2 = hrp2map$results[[2]],
pop3 = hrp2map$results[[3]],
labels = c("hrp2 Fitness 0.5", "hrp2 Fitness 1", "hrp2 Fitness 1.5"))
plot.scenario2.run1 <- plot.scenario2.run1 + ggtitle("hrp2 Simulation Scenario 2, Run 1")
plot.scenario2.run2 <- plothrp2models(pop1 = hrp2map$results[[4]],
pop2 = hrp2map$results[[5]],
pop3 = hrp2map$results[[6]],
labels = c("hrp2 Fitness 0.5", "hrp2 Fitness 1", "hrp2 Fitness 1.5"))
plot.scenario2.run2 <- plot.scenario2.run2 + ggtitle("hrp2 Simulation Scenario 2, Run 2")
cowplot::plot_grid(plot.scenario2.run1, plot.scenario2.run2,ncol=1)
Again, imagine that there are three populations where all populations start with approximately 10% frequency of hrp2 deletions, hrp2-deleted and wildtype strains have the same fitness, and all populations only use RDTs for diagnostics. However, the populations differ by:
Population 1:
Population 2:
Population 3:
# going to overwrite our old hrp2map and make a new one
hrp2map <- data.frame(scenario = "scnr3",
population = c("pop1", "pop2", "pop3", "pop1", "pop2", "pop3"),
run = c(1,1,1,2,2,2),
N=2000,
years=20,
strains.0 = 10,
rdt.nonadherence = 0,
EIR=c(1/365, 10/365, 100/365, 1/365, 10/365, 100/365),
microscopy.use = 0,
rdt.det = 0,
fitness = 1
)
hrp2map$results <- furrr::future_pmap(hrp2map[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")],
hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.938062650 0.016985028 0.005870036 0.466751918 0.000000000
## [1] 7.370000e+02 9.332143e-01 2.049986e-02 7.203522e-03 4.667519e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.378587e-01 1.696339e-02 6.095624e-03 1.438896e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.338092e-01 1.993720e-02 7.171302e-03 1.437008e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.276029e-01 2.446566e-02 8.849182e-03 4.740260e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.290152e-01 2.327101e-02 8.631510e-03 4.765013e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.265831e-01 2.495267e-02 9.381969e-03 9.346991e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.244456e-01 2.669770e-02 9.774449e-03 1.848101e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.286924e-01 2.359720e-02 8.628153e-03 4.703608e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.308742e-01 2.212872e-02 7.914828e-03 1.407455e+00
## [6] 3.650000e+01
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 58.583 secs"
## [1] 2.00000000 4.73924343 0.18200596 0.05870036 10.08552632 0.00000000
## [1] 737.00000000 4.72204553 0.18911037 0.06879384 7.11963589
## [6] 33.18181818
## [1] 1.472000e+03 4.723458e+00 1.869436e-01 6.954863e-02 9.903101e+00
## [6] 4.562500e+01
## [1] 2.207000e+03 4.728257e+00 1.854155e-01 6.627773e-02 1.030809e+01
## [6] 3.318182e+01
## [1] 2.942000e+03 4.724901e+00 1.864322e-01 6.861657e-02 1.106061e+01
## [6] 0.000000e+00
## [1] 3677.0000000 4.7372785 0.1777422 0.0649290 8.1716418
## [6] 20.2777778
## [1] 4.412000e+03 4.716510e+00 1.935651e-01 6.987458e-02 7.785445e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 4.718370e+00 1.920891e-01 6.949051e-02 7.318296e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 4.720333e+00 1.906057e-01 6.901117e-02 1.017744e+01
## [6] 2.807692e+01
## [1] 6.617000e+03 4.727345e+00 1.839619e-01 6.864267e-02 8.212500e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 61.435 secs"
## [1] 2.0000000 42.6553671 1.9278993 0.5870036 72.1400524 104.2857143
## [1] 737.0000000 42.4551214 1.9934233 0.7217252 76.3618421 78.2142857
## [1] 1472.0000000 42.4658225 1.9835096 0.7209379 79.1392904
## [6] 78.2142857
## [1] 2207.0000000 42.4187713 2.0152221 0.7362766 70.5666667
## [6] 36.5000000
## [1] 2942.000000 42.410492 2.025317 0.734461 69.183007 0.000000
## [1] 3677.0000000 42.4439460 1.9930017 0.7333222 79.2434211
## [6] 56.1538462
## [1] 4412.0000000 42.3597724 2.0702581 0.7402395 80.9533074
## [6] 104.2857143
## [1] 5147.0000000 42.3378093 2.0706755 0.7617852 73.3668342
## [6] 52.1428571
## [1] 5882.0000000 42.3850642 2.0392527 0.7459531 74.3860759
## [6] 156.4285714
## [1] 6617.000000 42.347591 2.057942 0.764737 64.665354 97.333333
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 64.949 secs"
## [1] 2.000000000 0.937916126 0.017131552 0.005870036 0.974632844 0.000000000
## [1] 7.370000e+02 9.365093e-01 1.778042e-02 6.627985e-03 4.765013e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.357766e-01 1.834104e-02 6.800057e-03 1.405648e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.364182e-01 1.797209e-02 6.527382e-03 4.602774e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.304741e-01 2.232570e-02 8.117910e-03 1.423927e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.292778e-01 2.315344e-02 8.486480e-03 1.910995e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.238639e-01 2.756527e-02 9.488524e-03 2.935657e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.264893e-01 2.514220e-02 9.286263e-03 4.790026e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.210457e-01 2.902277e-02 1.084921e-02 2.426862e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.261356e-01 2.556113e-02 9.220999e-03 1.427640e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 58.717 secs"
## [1] 2.00000000 4.73577449 0.18547490 0.05870036 7.60416667 0.00000000
## [1] 737.00000000 4.69113409 0.21200994 0.07680571 8.00645161
## [6] 0.00000000
## [1] 1.472000e+03 4.710380e+00 1.981550e-01 7.141472e-02 8.337563e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 4.696873e+00 2.070172e-01 7.605996e-02 1.249049e+01
## [6] 0.000000e+00
## [1] 2.942000e+03 4.707389e+00 2.004991e-01 7.206145e-02 1.387833e+01
## [6] 2.807692e+01
## [1] 3677.0000000 4.7100353 0.1976881 0.0722263 9.3112245
## [6] 0.0000000
## [1] 4.412000e+03 4.708275e+00 1.988245e-01 7.285007e-02 9.864865e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 4.717344e+00 1.920883e-01 7.051695e-02 1.320413e+01
## [6] 0.000000e+00
## [1] 5.882000e+03 4.705526e+00 2.032337e-01 7.118963e-02 1.222938e+01
## [6] 0.000000e+00
## [1] 6617.0000000 4.6980781 0.2074973 0.0743743 9.3112245
## [6] 0.0000000
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 61.663 secs"
## [1] 2.0000000 42.6368754 1.9463910 0.5870036 74.9337748 28.0769231
## [1] 737.000000 42.157668 2.201757 0.810845 80.309618 104.285714
## [1] 1472.0000000 42.1425733 2.2174143 0.8102824 72.4128686
## [6] 99.5454545
## [1] 2207.0000000 42.0362058 2.2998826 0.8341815 80.1219512
## [6] 136.8750000
## [1] 2942.0000000 42.2479283 2.1432675 0.7790742 74.9104683
## [6] 66.3636364
## [1] 3677.0000000 42.2928851 2.1101388 0.7672461 69.5704698
## [6] 33.1818182
## [1] 4412.0000000 42.2503483 2.1400656 0.7798561 69.8829201
## [6] 162.2222222
## [1] 5147.0000000 42.2801160 2.1204898 0.7696641 84.7233468
## [6] 219.0000000
## [1] 5882.0000000 42.2632834 2.1317238 0.7752628 74.7708895
## [6] 40.5555556
## [1] 6617.0000000 42.3409661 2.0698572 0.7594467 69.9343832
## [6] 26.0714286
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 65.457 secs"
plot.scenario3.run1 <- plothrp2models(pop1 = hrp2map$results[[1]],
pop2 = hrp2map$results[[2]],
pop3 = hrp2map$results[[3]],
labels = c("EIR 1", "EIR 10", "EIR 100"))
plot.scenario3.run1 <- plot.scenario3.run1 + ggtitle("hrp2 Simulation Scenario 3, Run 1")
plot.scenario3.run2 <- plothrp2models(pop1 = hrp2map$results[[4]],
pop2 = hrp2map$results[[5]],
pop3 = hrp2map$results[[6]],
labels = c("EIR 1", "EIR 10", "EIR 100"))
plot.scenario3.run2 <- plot.scenario3.run2 + ggtitle("hrp2 Simulation Scenario 3, Run 2")
cowplot::plot_grid(plot.scenario3.run1, plot.scenario3.run2,ncol=1)
You’ve now started to answer our Ministry of Health Official’s question, as we have shown that regions of low malaria transmission (low EIR) and high proportion of RDT use are predictive of hrp2-deletion fixation (Watson et al. 2017). In addition, we have shown that a large fitness cost associated with the hrp2-deletion (and a low starting frequency of the hrp2-deletion strain proportion) leads to rapid decay of the hrp2-deleted phenotype (Watson et al. 2017). As a result, we think that these hrp2-deleted parasites are viable and potentially problematic in low-endemicity areas (of note, OJ’s 2017 predictions have been pretty accurate to date, particularly in Eritrea and other regions in Northern Africa).
However, our Ministry of Health Official asks the next logical question: “Should I no longer be using RDTs”?
It is now a good time to note that this question mixes prediction and causal inference, both vast fields of study (the latter being the UNC Epi Department’s forte). We are going to glaze over a huge number of assumptions and caveats (which I apologize for).
Up until this point, we have been modeling the frequency (or proportion) of hrp2 parasites in each population. However, our question now asks about the utility of RDTs in a setting of hrp2 deletions. As such, we will take the “worst” possible scenario and compare two populations in a low EIR setting, where one population only uses RDT (and seeks treatment based on the result) and the other only uses microscopy (and also seeks treatment). We will then focus in on differences in incidence between these two populations.
# I am only going to do one run, but feel free to do more
incidencemap <- data.frame(run = 1,
N=2000,
years=20,
strains.0 = 10,
rdt.nonadherence = 0,
EIR=1/365,
microscopy.use = c(0,1),
rdt.det = 0,
fitness = 1)
incidencemap$results <- furrr::future_pmap(incidencemap[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")],
hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.938077617 0.016970061 0.005870036 1.380832282 0.000000000
## [1] 7.370000e+02 9.369766e-01 1.754639e-02 6.394711e-03 1.409266e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.325998e-01 2.069151e-02 7.626408e-03 9.554974e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.322743e-01 2.108312e-02 7.560305e-03 2.433333e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.346138e-01 1.929417e-02 7.009789e-03 9.681698e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.311059e-01 2.189642e-02 7.915389e-03 1.429504e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.391340e-01 1.585869e-02 5.925016e-03 9.205549e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.364444e-01 1.789061e-02 6.582744e-03 4.614412e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.440127e-01 1.253737e-02 4.367675e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.387442e-01 1.612428e-02 6.049199e-03 9.287532e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 38.472 secs"
## [1] 2.000000000 0.937969375 0.017078303 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.451041e-01 1.158966e-02 4.223977e-03 4.424242e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.491678e-01 8.318500e-03 3.431447e-03 4.579674e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.461855e-01 1.066867e-02 4.063513e-03 4.478528e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.416629e-01 1.414417e-02 5.110676e-03 4.445798e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.436262e-01 1.263243e-02 4.659044e-03 4.534161e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.508927e-01 7.385033e-03 2.639994e-03 9.136421e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.493611e-01 8.426921e-03 3.129676e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.509907e-01 7.156774e-03 2.770212e-03 9.707447e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.491489e-01 8.625576e-03 3.143214e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of 7350 days at 1 day intervals within 36.998 secs"
Let’s now pull out the incidence data.
ret <- EMSI2019hrp2lab::extracthrp2results(pop1 = incidencemap$results[[1]],
pop2 = incidencemap$results[[2]])
We can open it up to see what we extracted. Note, the pfinc column is from the hrp2malaRia .$S.Incidence column.
glimpse(ret)
## Observations: 480
## Variables: 5
## $ pfinc <dbl> 0.0004000000, 0.0003500000, 0.0004166667, 0.0004000…
## $ hrp2prev <dbl> 0.08888555, 0.09215263, 0.09355403, 0.09229883, 0.0…
## $ hrp2prevmono <dbl> 0.03331232, 0.03490653, 0.04072324, 0.05271786, 0.0…
## $ time <dbl> 30, 60, 91, 121, 152, 183, 213, 244, 275, 305, 336,…
## $ pop <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
ret %>%
ggplot(.) +
geom_line(aes(x=time, y=pfinc, group = factor(pop), colour=factor(pop))) +
xlab("Times (Days)") + ylab("Incidence") +
scale_color_manual(name = "", labels = c("Only RDT use", "Only Microscopy Use"), values = RColorBrewer::brewer.pal(8, "Set1")) +
theme_minimal() +
theme(
plot.title = element_text(hjust=0.5),
legend.position = "bottom"
)
To answer this question, it may help to subset to just a year of “time”. You could do this by adding the following to the code chunk above:
ret %>% dplyr::filter(time < 365) %>% # this would be the first year ggplot(.) + ...
Without going into too much detail, incidence is indexed by time (this is an “open population”“, so we can’t use”risk“). As a result, we can model the incidence difference between these two populations at each time-point in order to get a sense of how many more infections the”Only RDT Use" population has as compared to “Only Microscopy Use”.
days <- seq(0, (365*20))
pop1 <- ret %>%
dplyr::filter(pop == 1)
pop2 <- ret %>%
dplyr::filter(pop == 2)
# quick local linear interpolation to get an incidence for every single day of the 20 years (not just the S.times)
pop1 <- approx(x = pop1$time, y = pop1$pfinc, xout = days, method = "linear") %>%
cbind.data.frame() %>%
dplyr::rename(time = x, pfinc = y) %>%
dplyr::mutate(pop = 1)
pop2 <- approx(x = pop2$time, y = pop2$pfinc, xout = days, method = "linear") %>%
cbind.data.frame() %>%
dplyr::rename(time = x, pfinc = y) %>%
dplyr::mutate(pop = 2)
# now let's rejoin and plot the difference
left_join(pop1, pop2, by = "time") %>%
dplyr::mutate(diff = pfinc.x - pfinc.y) %>%
dplyr::filter(!is.na(diff)) %>% # get rid of boundary issues ("cheating, we could do better")
ggplot() +
geom_line(aes(x=time, y=diff), color = "blue") +
geom_hline(yintercept = 0, color = "red") +
xlab("Times (Days)") + ylab("Incidence") +
ggtitle("Incidence Difference between a Population Using only RDTs versus Microscopy") +
theme_minimal() +
theme(
plot.title = element_text(hjust=0.5),
legend.position = "bottom"
)
On your own, alter the simulation (e.g. vary EIR, etc.). How does the incidence difference plot change? What should you tell your Ministry of Health Official friend?
One thing to consider is the “cost” of a missed infection (e.g. an individual with hrp2 deleted parasites that was misdiagnosed by RDT, a “false-negative”), which is not reflected in the incidence plot above. Let’s look at that below (under the assumption that no cases were missed by microscopy under our model).
ret %>%
dplyr::filter(pop == 1) %>% # assume microscpy had no missed cases
dplyr::mutate(missedprop = pfinc * hrp2prevmono) %>%
ggplot() +
geom_line(aes(x=time, y=missedprop), color = "blue") +
xlab("Times (Days)") + ylab("hrp2-only Incidence") +
ggtitle("Incidence of hrp2 only infected individuals (missed case proportion)") +
theme_minimal() +
theme(
plot.title = element_text(hjust=0.5),
legend.position = "bottom"
)
Now taking into account the “cost” of a potential missed infection, does your response to our Ministry of Health friend change?
Thank you for completing this lab!!